ABSTRACT

The simplicity of standard score systems, percentile
equivalents, and their relation to the ideal normal distribution are
discussed and illustrated. The standard scores covered are z-scores,
T-scores, College Entrance Examination Board scores, and Army General
Classification Test scores. A derivative of the general standard
score system is the stanine plan, which divides the norm population
into nine groups; a row of nine percentages indicates the percent of
the total population in each of the stanines. Interpretation of the
Wechsler scales depends on a knowledge of standard scores. A
subject's raw score on each of the subtests in these scales is
converted, by appropriate norm tables, to a standard score, based on
a mean of 10 and a standard deviation of 3. The sums of standard
scores on the Verbal Scale, the Performance Scale, and the Full Scale
are then converted into IQs. These IQs are based on a standard score
mean of 100; the standard deviation of the IQs is set at 15 points. IQs
of the type used in the Wechsler scales are known as "deviation IQs,"
as contrasted with the IQs developed from scales in which a derived
mental age is divided by chronological age. (For related document,
see TM 002 944.) (DB)
METHODS OF EXPRESSING TEST SCORES

An individual's test score acquires meaning when it can be compared with the scores of well-identified
groups of people. Manuals for tests provide tables of norms to make it easy to compare individuals and
groups. Several systems for deriving more meaningful "standard scores" from raw scores have been widely
adopted. All of them reveal the relative status of individuals within a group.

The fundamental equivalence of the most popular standard score systems is illustrated in the chart on the
next page. We hope the chart and the accompanying description will be useful to counselors, personnel officers,
clinical diagnosticians and others in helping them to show the uninitiated the essential simplicity of standard
score systems, percentile equivalents, and their relation to the ideal normal distribution.

Sooner or later, every textbook discussion of test
scores introduces the bell-shaped normal curve. The
student of testing soon learns that many of the methods
of deriving meaningful scores are anchored to the
dimensions and characteristics of this curve. And he
learns by observation of actual test score distributions
that the ideal mathematical curve is a reasonably
good approximation of many practical cases. He learns
to use the standardized properties of the ideal curve
as a model.



Let us look first at the curve itself. Notice that there
are no raw scores printed along the baseline. The
graph is generalized; it describes an idealized distribution
of scores of any group on any test. We are
free to use any numerical scale we like. For any particular
set of scores, we can be arbitrary and call the
average score zero. In technical terms we "equate"
the mean raw score to zero. Similarly we can choose
any convenient number, say 1.00, to represent the
scale distance of one standard deviation.¹ Thus, if a
distribution of scores on a particular test has a mean
of 36 and a standard deviation of 4, the zero point on
the baseline of our curve would be equivalent to an
original score of 36; one unit to the right, +1σ, would
be equivalent to 40, (36 + 4); and one unit to the left,
−1σ, would be equivalent to 32, (36 − 4).

¹The mathematical symbol for the standard deviation is the
lower case Greek letter sigma or σ. These terms are used
interchangeably in this article.
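The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration of the conversion described above; the function names `to_sigma_units` and `to_raw_score` are hypothetical and do not come from the article.

```python
def to_sigma_units(raw, mean, sd):
    """Express a raw score as a deviation from the mean in sigma units."""
    return (raw - mean) / sd


def to_raw_score(sigma_units, mean, sd):
    """Map a baseline position (in sigma units) back to a raw score."""
    return mean + sigma_units * sd


# The example from the text: mean 36, standard deviation 4.
MEAN, SD = 36, 4
for position in (-1, 0, 1):
    print(position, to_raw_score(position, MEAN, SD))
```

Running the loop gives raw scores of 32, 36, and 40 for −1σ, 0, and +1σ, matching the values worked out in the paragraph.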

The total area under the curve represents the total
number of scores in the distribution. Vertical lines
have been drawn through the score scale (the baseline)
at zero and at 1, 2, 3, and 4 sigma units to the right
and left. These lines mark off subareas of the total area
under the curve. The numbers printed in these subareas
are per cents, that is, percentages of the total number
of people. Thus, 34.13 per cent of all cases in a normal
distribution have scores falling between 0 and −1σ.
For practical purposes we rarely need to deal with
standard deviation units below −3 or above +3; the
percentage of cases with scores beyond ±3σ is negligible.

The fact that 68.26 per cent fall between ±1σ gives
rise to the common statement that in a normal distribution
roughly two-thirds of all cases lie between
plus and minus one sigma. This is a rule of thumb
every test user should keep in mind. It is very near
to the theoretical value and is a useful approximation.
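These areas can be checked against the ideal normal model with Python's standard library. The sketch below is an illustration, not part of the article; `percent_between` is a hypothetical helper name. Note that 68.26 is the sum of the two rounded 34.13 figures, while the unrounded area comes out slightly higher.

```python
from statistics import NormalDist

unit_normal = NormalDist()  # mean 0, standard deviation 1


def percent_between(low, high):
    """Per cent of cases between two baseline points given in sigma units."""
    return 100 * (unit_normal.cdf(high) - unit_normal.cdf(low))


print(round(percent_between(-1, 0), 2))        # 34.13
print(round(percent_between(-1, 1), 2))        # 68.27
print(round(100 - percent_between(-3, 3), 2))  # 0.27
```

The last line shows why scores beyond ±3σ can usually be ignored: under the normal model they account for roughly a quarter of one per cent of cases.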

Below the row of deviations expressed in sigma
units is a row of per cents; these show cumulatively
the percentage of people which is included to the
left of each of the sigma points. Thus, starting from
the left, when we reach the line erected above −2σ,
we have included the lowest 2.3 per cent of cases.
These percentages have been rounded in the next row.

[Chart note] For example, both 600 on the CEEB and 120 on the AGCT are
one standard deviation above their respective means, but they do not represent equal standings because the scores were ...

The contents of this Bulletin are not copyrighted; the articles may be quoted or reprinted without formality other ...



Note some other relationships: the area between the
±1σ points includes the scores which lie above the
16th percentile (−1σ) and below the 84th percentile
(+1σ), two major reference points all test users
should know. When we find that an individual has a
score 1σ above the mean, we conclude that his score
ranks at the 84th percentile in the group of persons
on whom the test was normed. (This conclusion is
good provided we also add this clause, at least subvocally:
if this particular group reasonably approximates
the ideal normal model.)

The simplest facts to memorize about the normal
distribution and the relation of the percentile system
to deviations from the average in sigma units are seen
in the chart. They are

Deviation from the mean        Percentile
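The percentile that corresponds to any deviation from the mean can be recomputed from the normal model. The following is a minimal sketch, assuming an exactly normal distribution; `percentile_at` is a hypothetical name, and the sigma points in the loop are chosen for illustration rather than taken from the table.

```python
from statistics import NormalDist

unit_normal = NormalDist()  # the ideal normal model


def percentile_at(sigma_units):
    """Cumulative per cent of cases to the left of a point in sigma units."""
    return 100 * unit_normal.cdf(sigma_units)


for point in (-2, -1, 0, 1, 2):
    print(f"{point:+d} sigma -> {percentile_at(point):.1f}")
```

Rounded to whole numbers, the values at −1σ and +1σ are the 16th and 84th percentiles cited in the text, and the value at −2σ is the 2.3 per cent mentioned earlier.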
To avoid cluttering the graph, reference lines have
not been drawn, but we could mark off ten per cent
sections of area under the normal curve by drawing
lines vertically from the indicated decile points (10,
20, . . . 80, 90) up through the graph. The reader
might do this lightly with a colored pencil.

We can readily see that ten per cent of the area
(people) at the middle of the distribution embraces
a smaller distance on the baseline of the curve than
ten per cent of the area (people) at the ends of the
range of scores, for the simple reason that the curve
is much higher at the middle. A person who is at the
95th percentile is farther away from a person at the
85th percentile in units of test score than a person at
the 55th percentile is from one at the 45th percentile.
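The unequal baseline distances covered by equal percentile steps can be made concrete with the inverse of the normal curve. This is a sketch under the assumption of an exactly normal distribution; `baseline_distance` is a hypothetical helper.

```python
from statistics import NormalDist

unit_normal = NormalDist()


def baseline_distance(lower_pct, upper_pct):
    """Distance in sigma units between two percentile points."""
    lower = unit_normal.inv_cdf(lower_pct / 100)
    upper = unit_normal.inv_cdf(upper_pct / 100)
    return upper - lower


middle = baseline_distance(45, 55)      # ten per cent of people at the center
upper_tail = baseline_distance(85, 95)  # ten per cent of people near the top
print(round(middle, 2), round(upper_tail, 2))  # 0.25 0.61
```

The same ten per cent of people spans more than twice as much of the score scale near the top of the distribution as it does at the middle.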

The remainder of the chart, that is, the several
scoring scales drawn parallel to the baseline, illustrates
variations of the deviation score principle. As a class
these are called standard scores.

First, there are the z-scores. These are the same
numbers as shown on the baseline of the graph; the
only difference is that the expression, σ, has been
omitted. These scores run, in practical terms, from
−3.0 to +3.0. One can compute them to more decimal
places if one wishes, although computing to a single
decimal place is usually sufficient. One can compute
z-scores by equating the mean to 0.00 and the standard
deviation to 1.00 for a distribution of any shape,
but the relationships shown in this figure between
the z-score equivalents of raw scores and percentile
equivalents of raw scores are correct only for normal
distributions. The interpretation of standard score
systems derives from the idea of using the normal
curve as a model.

As can be seen, T-scores are directly related to
z-scores. The mean of the raw scores is equated to 50,
and the standard deviation of the raw scores is equated
to 10. Thus a z-score of +1.5 means the same as a
T-score of 65. T-scores are usually expressed in whole
numbers from about 20 to 80. The T-score plan eliminates
negative numbers and thus facilitates many
computations.²
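The T-score conversion is a one-line linear transformation of the z-score. The sketch below is illustrative only; `t_score` and `t_score_from_raw` are hypothetical names, and the raw-score example reuses the mean of 36 and standard deviation of 4 from earlier in the article.

```python
def t_score(z):
    """Convert a z-score to a whole-number T-score (mean 50, SD 10)."""
    return round(50 + 10 * z)


def t_score_from_raw(raw, mean, sd):
    """Convert a raw score to a T-score via its z-score."""
    return t_score((raw - mean) / sd)


print(t_score(1.5))                 # 65
print(t_score(-3.0), t_score(3.0))  # 20 80
print(t_score_from_raw(42, 36, 4))  # raw 42 is +1.5 sigma, so 65
```

The practical z-score range of −3.0 to +3.0 maps onto T-scores of 20 to 80, which is why the scale never needs negative numbers.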

²T-scores and percentiles both have 50 as the main reference
point, an occasional source of confusion to those who do not
insist on careful labelling of data and of scores of individuals
in their records.

The College Entrance Examination Board uses a
plan in which both decimals and negative numbers
are avoided by setting the arbitrary mean at 500
points and the arbitrary sigma at another convenient
unit, namely, 100 points. The experienced tester or
counselor who hears of a College Board SAT-V score
of 550 at once thinks, "Half a sigma (50 points) above
average (500 points) on the CEEB basic norms."
And when he hears of a score of 725 on SAT-M, he
can interpret, "Plus 2¼σ. Therefore, better than the
98th percentile."
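The counselor's mental arithmetic for College Board scores can be written out as follows. This is a sketch that assumes the norm group is exactly normal; the constant and function names are hypothetical.

```python
from statistics import NormalDist

CEEB_MEAN, CEEB_SD = 500, 100  # the arbitrary mean and sigma given in the text


def ceeb_to_sigma(score):
    """Deviation of a College Board score from the mean, in sigma units."""
    return (score - CEEB_MEAN) / CEEB_SD


def percentile_under_normal_model(score):
    """Percentile rank implied by the ideal normal curve (an assumption)."""
    return 100 * NormalDist().cdf(ceeb_to_sigma(score))


for score in (550, 725):
    sigma = ceeb_to_sigma(score)
    pct = percentile_under_normal_model(score)
    print(score, sigma, round(pct, 1))
```

A score of 550 comes out half a sigma above the mean, and 725 comes out 2.25 sigma above it, which places it above the 98th percentile under the normal model.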

During World War II the Navy used the T-score
plan of reporting test status. The Army used still
another system with a mean of 100 and a standard
deviation of 20 points.

Another derivative of the general standard score
system is the stanine plan, developed by psychologists
in the Air Force during the war. The plan divides the
norm population into nine groups, hence, "standard
nines." Except for stanine 9, the top, and stanine 1,
the bottom, these groups are spaced in half-sigma
units. Thus, stanine 5 is defined as including the
people who are within ±0.25σ of the mean. Stanine 6
is the group defined by the half-sigma distance on the
baseline between +0.25σ and +0.75σ. Stanines 1 and
9 include all persons who are below −1.75σ and above
+1.75σ, respectively. The result is a distribution in
which the mean is 5.0 and the standard deviation is
approximately 2.0.
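The stanine boundaries described above translate directly into a lookup. The following sketch is illustrative; `STANINE_CUTS` and `stanine` are hypothetical names, and the handling of a score that falls exactly on a boundary (assigned here to the higher stanine) is an assumption the text does not settle.

```python
from bisect import bisect_right

# Boundaries between adjacent stanines, in sigma units, spaced half a sigma apart.
STANINE_CUTS = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]


def stanine(z):
    """Return the stanine (1 to 9) for a score expressed in sigma units.

    Assumption: a score exactly on a boundary goes to the higher stanine.
    """
    return bisect_right(STANINE_CUTS, z) + 1


print(stanine(0.0))   # within ±0.25 sigma of the mean: stanine 5
print(stanine(0.5))   # between +0.25 and +0.75 sigma: stanine 6
print(stanine(-2.0), stanine(2.0))  # beyond ±1.75 sigma: stanines 1 and 9
```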

Just below the line showing the demarcation of the
nine groups in the stanine system there is a row of
percentages which indicates the per cent of the total
population in each of the stanines. Thus 7 per cent of
the population will be in stanine 2, and 20 per cent in
the middle group, stanine 5.
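The percentage in each stanine follows from the half-sigma boundaries and the normal curve. This sketch assumes an exactly normal norm population; the variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

unit_normal = NormalDist()
CUTS = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]

# Cumulative proportion at each boundary, padded with 0 and 1 for the open ends.
cumulative = [0.0] + [unit_normal.cdf(c) for c in CUTS] + [1.0]

# Per cent of the population in stanines 1 through 9.
shares = [100 * (hi - lo) for lo, hi in zip(cumulative, cumulative[1:])]
rounded = [round(s) for s in shares]
print(rounded)  # [4, 7, 12, 17, 20, 17, 12, 7, 4]

# Mean and standard deviation of the resulting stanine distribution.
mean = sum(k * s for k, s in enumerate(shares, start=1)) / 100
sd = sqrt(sum((k - mean) ** 2 * s for k, s in enumerate(shares, start=1)) / 100)
print(round(mean, 2), round(sd, 2))
```

The rounded shares reproduce the 7 per cent for stanine 2 and 20 per cent for stanine 5 cited above. The computed mean is 5, and the standard deviation comes out just under 2.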

Interpretation of the Wechsler scales (W-B I, W-B
II, WISC, and WAIS) depends on a knowledge of
standard scores. A subject's raw score on each of the
subtests in these scales is converted, by appropriate
norms tables, to a standard score, based on a mean
of 10 and a standard deviation of 3. The sums of
standard scores on the Verbal Scale, the Performance
Scale, and the Full Scale are then converted into IQs.
These IQs are based on a standard score mean of 100,
the conventional number for representing the IQ of
the average person in a given age group. The standard
deviation of the IQs is set at 15 points. In practical
terms, then, roughly two-thirds of the IQs are
between 85 and 115, that is, ±1σ.³ IQs of the type used

³Every once in a while we receive a letter from someone who
suggests that the Wechsler scales ought to generate a wider
range of IQs. The reply is very simple. If we want a wider
range of IQs all we have to do is to choose a larger arbitrary
standard deviation, say, 20 or 25. Under the present system,
±3σ gives IQs of 55 to 145, with a few rare cases below and
a few rare cases above. If we used 20 as the standard deviation,
we would arbitrarily increase the ±3σ range of IQs from 55-145
to 40-160. This is a wider range of numbers! But test
users should never forget that adaptations of this kind do not
change the responses of the people who took the test, do not
change the order of the persons in relation to each other, and
do not change the psychological meaning attached to an IQ.
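The point of this footnote can be illustrated with a small sketch. Actual Wechsler IQs are read from norm tables; the linear formula below only illustrates the deviation scale itself. The function `deviation_iq` and the five labelled persons are hypothetical.

```python
def deviation_iq(z, sd=15, mean=100):
    """Place a standing in sigma units on a deviation-IQ scale."""
    return mean + sd * z


# Hypothetical persons and their standings in sigma units.
z_scores = {"A": -3.0, "B": -1.0, "C": 0.0, "D": 1.0, "E": 3.0}

iq_sd15 = {person: deviation_iq(z) for person, z in z_scores.items()}
iq_sd20 = {person: deviation_iq(z, sd=20) for person, z in z_scores.items()}

print(iq_sd15)  # ±3 sigma spans 55 to 145, ±1 sigma spans 85 to 115
print(iq_sd20)  # ±3 sigma spans 40 to 160

# The order of the persons is the same on either scale.
rank_sd15 = sorted(iq_sd15, key=iq_sd15.get)
rank_sd20 = sorted(iq_sd20, key=iq_sd20.get)
print(rank_sd15 == rank_sd20)
```

Choosing a larger standard deviation stretches the numbers but leaves every person's standing relative to the others unchanged.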